General Game Learning 


Matthew Fisher 


What is general game learning? 


Learning to play games you have never seen 
- Reading or listening to the rules
- Asking questions about the rules
- Learning the rules by playing (ex. Tetris)
- Figuring out the objective
- Being reprimanded for making illegal moves
- Learning from mistakes
- Deciding on good strategies
  - The focus of 99% of game AI research


Goal of this talk 


Convince you that general game learning is: 
- Important
- Interesting
- Possible


Why is GGL specifically interesting? 


Task Automation 
- Automating real tasks is a big AI goal (folding clothes, driving cars, SQL-injection attacks)
  - manually code specialized controllers for each task
  - have controllers that can learn a task from examples
- Real tasks are hard to set up and evaluate
- Digital games are plentiful, complex, accessible to a computer, and easily evaluated
- To a computer there is no difference between the real world and a good simulation


Games 


Why are games a great task? 
- Games have a clear goal that can be quantified
- Games are challenging to understand
  - Games require complex reasoning
  - Games require rule learning that is often incomplete
- Digital games are fully accessible to computers
- Games are one way humans and computers interact
- Games are fun and popular
  - Deep Blue, Watson
- The dataset of available games is huge and untapped


We will use digital games to study... 


- Unsupervised object segmentation and categorization
- Rule learning
- Knowledge transfer between categories
  - Between Mario & Luigi
- Goal inference
- Game state value estimation
- Action planning
- Knowledge transfer between games
  - Between Super Mario 1 and Super Mario 3
  - Between Shogi and Chess


Artificial intelligence 


Goal is to solve problems


(Some) Tasks in General Learning


- Define meaningful internal representation
  - Categorize, classify, and learn from raw data
- Process raw signal
- Map raw signal onto internal representation
- Solve problem using the internal representation
- Map solution to an action and execute it


Representation is everything 


Trying to learn rules without a good representation is impossible

- Imagine trying to learn the rules of Chess at the level of pixels or 2x2 blocks
- We can guess that one representation is better than another if it is easier to learn rules about how the environment behaves using the representation


Real-world


Human Game Learning Process


[Screenshot: opening of the Wikipedia article "Chess"]


Real-world Game Learning


Research is bottlenecked by 

- 3D object segmentation and categorization
- 3D environment reconstruction
- Natural language processing

Human game learning in the physical world still provides us with useful insights
- Constructing a good representation for rule learning and action planning is rarely challenging to humans


Existing Game Research 


Game AI Research


An AI that plays Chess well is only the summit of Mount Everest
- It assumes that you already have a perfect game representation and understanding of the rules that define this representation

General Game Playing (Michael Genesereth) generalizes learning strategy across games but ignores learning the game representation


Forcing General Game Learning 


A few people have tried general game learning 


- Dramatically reduce space of games under consideration (ex. RPS, TTT variants)
- Manually teach system all relevant categories and train classifiers for them
  - the largest number of categories that has been successfully used is 3
- Focus is on robots interacting with humans, not so much on game learning


State of the art (2011) 


- People have taught robots to observe two entities playing tic-tac-toe or rock-paper-scissors in contrived settings and then attempt to play these games


Digital-world 
General Game Learning 


What games 


Many different types of games 
- Real-time (Mario) vs. turn-based (Reversi)
- Discrete (Tetris) vs. continuous (Space Invaders)
- Complete information (FreeCell) vs. incomplete information (Poker)
- Single player (Zelda) vs. competitive (Chess)


For now, focus is on 2D, sprite-based games


(One Possible) General Game
Learning Pipeline


Inputs 


AI gets a video feed of the game

AI can take actions as if it were a human (ex. with the keyboard, mouse, or game controller)

AI may have access to: (semi-supervised)
- Human playthroughs of the game
- Partial annotations of playthroughs denoting specific objectives (ex. "reach the final boss", "don't die")


Pipeline Strategy 


Construct many possible representations 
- Each segmentation of the input video feed leads to different symbolic representations
- Attempt to learn rules on each symbolic representation independently
- The representation which leads to the simplest rules and best predicts future states of the game is the best representation
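
One way to make the selection criterion concrete is sketched below. It is only an illustration under stated assumptions: the rule learner is a trivial lookup table, "simplicity" is measured by the number of stored rules, and the names `learn_rules`, `score`, `best_representation`, and the `complexity_weight` parameter are hypothetical, not part of the talk.

```python
from typing import Dict, Hashable, Sequence, Tuple

Transition = Tuple[Hashable, str, Hashable]  # (state, action, next_state)


def learn_rules(transitions: Sequence[Transition]) -> Dict[Tuple[Hashable, str], Hashable]:
    """Toy rule learner: remember the first observed outcome of each (state, action)."""
    rules: Dict[Tuple[Hashable, str], Hashable] = {}
    for state, action, next_state in transitions:
        rules.setdefault((state, action), next_state)
    return rules


def score(rules, held_out: Sequence[Transition], complexity_weight: float) -> float:
    """Lower is better: held-out error rate plus a penalty per learned rule."""
    errors = sum(1 for s, a, nxt in held_out if rules.get((s, a)) != nxt)
    return errors / max(len(held_out), 1) + complexity_weight * len(rules)


def best_representation(frames, actions, candidates, complexity_weight=0.1):
    """Return (name of the lowest-scoring candidate, all scores)."""
    split = len(actions) // 2  # first half for learning, second half for testing
    scores = {}
    for name, encode in candidates.items():
        states = [encode(frame) for frame in frames]
        transitions = list(zip(states[:-1], actions, states[1:]))
        rules = learn_rules(transitions[:split])
        scores[name] = score(rules, transitions[split:], complexity_weight)
    return min(scores, key=scores.get), scores


# Hypothetical raw frames: (x position, flickering background value).
frames = [(i % 4, (i * 7) % 5) for i in range(41)]
actions = ["right"] * 40
candidates = {
    "raw": lambda frame: frame,             # keeps the irrelevant flicker
    "position": lambda frame: (frame[0],),  # keeps only the x position
}
best, scores = best_representation(frames, actions, candidates)
```

In this toy setup both candidates predict the held-out frames, but the representation that drops the flicker needs far fewer rules, so it gets the lower score.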


Game Learning Pipeline 


- Map video feed of the game onto a candidate symbolic representation
- Decide what kind of actions are possible
- Learn a model for how symbolic representation evolves as a function of the action taken
- Determine the goal of the game
- Plan actions that accomplish the goal

Specializing for a given game is "easy"; our focus is on generalizing learning across games


1. Map to symbolic representation


Observable State


Hidden Variables
WhitePlayer has control 


The black rooks and black king have not moved 
No En passant is possible 


2. What actions are possible 


3. What rules govern the game


How does the game state evolve in response to actions and time?


4. Determine the goal


[Screenshot: Chess Titans end-of-game dialog reporting that the game has been counted as a loss]


5. Execute actions leading to goal


Game State Value


Imperfect game models


Learning at each stage in this pipeline is going to be hard, but it doesn't need to be perfect
- You can play a fine game of Chess even if you don't know the en passant or castling rules
- The rules that occur often are typically the ones that are most important


Pipeline Stages 


Stage 1: Extract Symbolic 
Representation 


Extract Symbolic Representation 


Co-segmentation 


- Assume the set of input images contains many instances of similar objects
- Simultaneously segment the images and extract a set of commonly-occurring templates

Template matching
- Use template matching to isolate the instances and extract a set of representations
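
A minimal sketch of the template-matching step follows. It assumes sprites are rendered pixel-exactly (plausible for 2D sprite-based games, but an assumption here) and that frames are small 2D lists of color indices; `match_template` is a hypothetical helper, and a real system would more likely use a tolerance-based similarity measure.

```python
def match_template(image, template):
    """Return every (row, col) where the template matches the image exactly."""
    image_h, image_w = len(image), len(image[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    hits = []
    for r in range(image_h - tmpl_h + 1):
        for c in range(image_w - tmpl_w + 1):
            if all(
                image[r + dr][c + dc] == template[dr][dc]
                for dr in range(tmpl_h)
                for dc in range(tmpl_w)
            ):
                hits.append((r, c))
    return hits


# Hypothetical 4x5 frame containing two copies of a 2x2 sprite.
frame = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 1, 2],
    [0, 0, 0, 3, 4],
]
sprite = [[1, 2], [3, 4]]
instances = match_template(frame, sprite)
```

Each hit becomes one symbol instance (template identity plus position) in the candidate symbolic representation.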


Template Matching


Human Knowledge


The AI cannot be expected to re-derive the meaning of human language and numerals
- Search for dominant patterns among the symbol sets (grids, graphs, text)
- When certain patterns are detected (ex. a string of glyphs) use character recognition to replace them with a related symbolic form (ex. "Score" or "9000")
- One goal of GGL is to discover what patterns are necessary and which can be learned


Stage 2: Determine Valid Actions 


Discrete input 


Near-continuous input


Stage 3: Learn a Game Model


Game Model 


A game model encodes all the relevant rules of the game
- Defines which actions are legal (Chess, Reversi)
- Takes the current state and a user action and returns the next state
- Need to learn from observations and experiments
- Markov decision process


Games are complicated but the behavior of subcomponents is often simple
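
The interface described above (state plus action in, next state out, learned from observations) can be sketched as a count-based transition table. This is an illustrative assumption, not the model from the talk: `LearnedGameModel` is a hypothetical class, and it treats an action as "legal" only if it has been observed in that state.

```python
from collections import Counter, defaultdict


class LearnedGameModel:
    """Empirical transition model: counts of next states per (state, action)."""

    def __init__(self):
        self.counts = defaultdict(Counter)

    def observe(self, state, action, next_state):
        """Record one observed transition."""
        self.counts[(state, action)][next_state] += 1

    def legal_actions(self, state, all_actions):
        """Actions that have been seen at least once in this state."""
        return [a for a in all_actions if (state, a) in self.counts]

    def predict(self, state, action):
        """Most frequently observed next state, or None if never observed."""
        outcomes = self.counts.get((state, action))
        if not outcomes:
            return None
        return outcomes.most_common(1)[0][0]


model = LearnedGameModel()
for _ in range(3):
    model.observe(0, "right", 1)   # pressing right usually moves one step
model.observe(0, "right", 0)       # occasionally blocked
```

Because the table stores counts rather than a single outcome, it can also be read as an estimate of the transition probabilities of a Markov decision process.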


Simple Entity Behavior


Natural Language Rules


Chess 

- A player can move a rook or queen along any horizontal or vertical straight line so long as the intervening squares are all empty
- Players can only move pieces of their color
- It is legal for white player to move a white pawn forward two squares if it is in its home row and the two squares in front of the pawn are empty


Natural Language Rules 


Super Mario Bros. 
- Pressing right moves Mario to the right
- If Mario falls into a pit or fire he will die
- Goombas move slowly in one direction and can fall off ledges
- If Goombas hit an obstacle they will flip directions
- Jumping on Goombas kills them but colliding with them from the side will hurt Mario
- When Mario reaches a flag at the end of a stage he advances to the next stage in that world


Models encode rules 


Need to specify a language that can encode game rules
- Complex enough to model important game behavior
- Simple enough to be learned from observations

Genre-specific vocabulary
- Piece, grid, board location, move
- Collision, velocity, parabola, atlas/map, portal


Reversi 


[Screenshot: Reversi game window showing a position with Black: 5, White: 3]


Game Model Languages 


Legal(whitePlayer, place(x),
    line(x, n, m) ∈ blackPiece &&
    lineEndPt(x, n, m) ∈ whitePiece)


"It is legal for white to place at a coordinate x if the line starting at x parameterized by n and m contains only black pieces and the end point of this line is a white piece, for all board locations x and certain values of n and m"


Terminal(countPieces(whitePiece) +
    countPieces(blackPiece) == 64)
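
For comparison, the two rules above can be written as ordinary code. The sketch below is a hand-written version, not a learned model: the board encoding (`"."`, `"W"`, `"B"` in an 8x8 list of lists) and the function names are assumptions, and the terminal test mirrors the slide's simplified piece-count rule rather than the full Reversi ending conditions.

```python
EMPTY, WHITE, BLACK = ".", "W", "B"
DIRECTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def is_legal(board, row, col, me=WHITE, opp=BLACK):
    """Legal if some line from (row, col) holds only opponent pieces and ends on one of mine."""
    if board[row][col] != EMPTY:
        return False
    size = len(board)
    for dr, dc in DIRECTIONS:
        r, c = row + dr, col + dc
        seen_opp = False
        while 0 <= r < size and 0 <= c < size and board[r][c] == opp:
            r, c = r + dr, c + dc
            seen_opp = True
        if seen_opp and 0 <= r < size and 0 <= c < size and board[r][c] == me:
            return True
    return False


def is_terminal(board):
    """Slide's simplified rule: the game ends when every square holds a piece."""
    pieces = sum(cell in (WHITE, BLACK) for row in board for cell in row)
    return pieces == len(board) * len(board[0])


def starting_board():
    """Standard Reversi opening position."""
    board = [[EMPTY] * 8 for _ in range(8)]
    board[3][3], board[4][4] = WHITE, WHITE
    board[3][4], board[4][3] = BLACK, BLACK
    return board
```

The direction pair `(dr, dc)` plays the role of the line parameters n and m in the rule, and the `while` loop checks the "contains only black pieces" condition.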


Learning Rules and Representation 


Rules that apply generally are more likely than a bunch of special cases
- Information gain criteria
- Occam's razor

Categories are important
- Two entities probably belong to the same category if most of the rules that apply to entity A apply to entity B
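
The category heuristic can be sketched as a rule-overlap test. The rule names, the 0.5 threshold, and the choice to require the overlap in both directions are all assumptions made for illustration.

```python
def shared_fraction(rules_a, rules_b):
    """Fraction of A's rules that also apply to B."""
    rules_a, rules_b = set(rules_a), set(rules_b)
    if not rules_a:
        return 0.0
    return len(rules_a & rules_b) / len(rules_a)


def same_category(rules_a, rules_b, threshold=0.5):
    """Same category if most rules carry over in both directions (assumed symmetric test)."""
    return (shared_fraction(rules_a, rules_b) > threshold
            and shared_fraction(rules_b, rules_a) > threshold)


# Hypothetical learned rule sets for three entities.
mario = {"falls", "dies_in_pit", "moves_right_on_press", "jumps"}
luigi = {"falls", "dies_in_pit", "moves_right_on_press", "jumps_higher"}
goomba = {"falls", "flips_on_obstacle", "hurts_mario"}
```

With these sets, Mario and Luigi share three of four rules each way, while Mario and a Goomba share only one.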


Learning Rules and Representation 


A representation is good if it is easy to build rules that describe it
- We can use this to decide between many possible candidate representations


[Plot; axis label: Number of Training Games]


Stage 4: Determine the Goal 


Giveaway Checkers 


Giveaway Checkers 


How can we tell if we are playing regular Checkers or Giveaway Checkers?
- If we have observations of humans playing the game, what value are they placing on their pieces?
- If we have an AI (or human) opponent, are they trying to capture our pieces or not?
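
One crude way to answer the first question is sketched below, assuming we can observe finished games and know who won. The input format and the function `infer_piece_value_sign` are hypothetical; the idea is only that winners who finish with more pieces suggest pieces are valuable, and winners who finish with fewer suggest the opposite.

```python
def infer_piece_value_sign(finished_games):
    """finished_games: list of (winner_piece_count, loser_piece_count) pairs.

    Returns +1 if keeping pieces appears good, -1 if losing them appears good,
    and 0 if the observations do not favor either reading.
    """
    margin = sum(winner - loser for winner, loser in finished_games)
    if margin > 0:
        return 1
    if margin < 0:
        return -1
    return 0


# Hypothetical observations.
regular_checkers = [(5, 0), (3, 0), (4, 1)]
giveaway_checkers = [(0, 4), (0, 2)]
```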


State Value Inference 


True Goal of Chess
- Construct a situation in which your opponent is unable to prevent you from capturing their king

Inferred Value Function
- Capture enemy pieces, keep your own pieces from being captured
- King=20, Queen=10, Rook=6, Bishop=4, Knight=3, Pawn=1
- We can play the game just fine using this value function; in fact, it is ultimately more useful
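
The inferred value function can be written down directly. The piece values come from the slide; the state encoding (a list of `(owner, kind)` pairs with one-letter piece codes) is an assumption for illustration.

```python
# Piece values as listed on the slide.
PIECE_VALUES = {"K": 20, "Q": 10, "R": 6, "B": 4, "N": 3, "P": 1}


def material_value(pieces, player):
    """Sum of the player's piece values minus the opponent's."""
    total = 0
    for owner, kind in pieces:
        value = PIECE_VALUES[kind]
        total += value if owner == player else -value
    return total


# Hypothetical position: white has king and queen, black has king, rook, pawn.
position = [
    ("white", "K"), ("white", "Q"),
    ("black", "K"), ("black", "R"), ("black", "P"),
]
```

A function like this gives a usable score at every position, whereas the true goal only distinguishes won, lost, and undecided games.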


State Value Inference


General Values


Novelty
- Seeing the death screen 10,000 times is not as interesting as exploring the world. If you explore a game and manage to see the ending, you've won.

Canonical directions
- In many platformer games "proceed to the right, and don't die" is a good approximation of the goal

Number of actions
- Maximizing the number of actions available is a good idea in many piece-type games
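
The novelty value can be sketched as a visit-count bonus. The `1 / sqrt(visits)` decay is a common choice assumed here, not something stated in the talk, and `NoveltyReward` is a hypothetical name.

```python
import math
from collections import Counter


class NoveltyReward:
    """Rewards states in proportion to how rarely they have been seen."""

    def __init__(self):
        self.visits = Counter()

    def reward(self, state):
        """Count the visit and return a bonus that shrinks with repetition."""
        self.visits[state] += 1
        return 1.0 / math.sqrt(self.visits[state])


novelty = NoveltyReward()
death_rewards = [novelty.reward("death_screen") for _ in range(4)]
new_area_reward = novelty.reward("world_1_2")
```

Under this scheme the death screen quickly stops being worth anything, while a newly reached screen earns the full bonus.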


State Value Inference


Level, lines, score, life


See what humans appear to be optimizing 


Try to optimize each possible number on many 
different passes, see which results in the best 
behavior (ex. exploration criteria) 


Prior work 


Value function inference for specific games is often used to derive information from experts
- Value function for Chess was parameterized with approximately 8,000 parameters
- Parameter values were learned by analyzing thousands of grandmaster Chess games

Goal is to use this idea on learned game models and construct "general parameterizations" of the value function


Stages 1-4 Summary


Stage 5: Plan Actions


Execute actions leading to goal 


Different games are solved in different ways
- Minimax tree search (Chess, Chinese Checkers)
  - Requires value function:
    - Use one modeled from examples in Stage 4
    - Use one derived from General Game Playing
- Motion planning (Mario, Tetris)
  - Try many different state value functions
  - Reinforcement learning techniques
    - Markov decision process formulation
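
A depth-limited minimax search over a learned model might look like the sketch below. The callbacks `moves`, `apply_move`, and `value` are hypothetical stand-ins for the learned game model and the Stage 4 value function; the toy game tree exists only to exercise the search.

```python
def minimax(state, depth, maximizing, moves, apply_move, value):
    """Return (best achievable value, best move) from the maximizer's point of view."""
    options = moves(state, maximizing)
    if depth == 0 or not options:
        return value(state), None
    best_value = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in options:
        child_value, _ = minimax(
            apply_move(state, move), depth - 1, not maximizing, moves, apply_move, value
        )
        if (maximizing and child_value > best_value) or (
            not maximizing and child_value < best_value
        ):
            best_value, best_move = child_value, move
    return best_value, best_move


# Toy game tree: inner nodes are dicts of moves, leaves are state values.
tree = {"a": {"c": 3, "d": 5}, "b": {"e": 2, "f": 9}}
moves = lambda state, maximizing: list(state) if isinstance(state, dict) else []
apply_move = lambda state, move: state[move]
value = lambda state: state if isinstance(state, int) else 0

result = minimax(tree, 2, True, moves, apply_move, value)
```

The search only touches the game through the three callbacks, so an imperfect learned model can be plugged in without changing the planner.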


Execute actions leading to goal 


This is what General Game Playing is for
- Active area of research
- CS227B, Michael Genesereth
- Not designed for incomplete game models


Uses Game Description Language
- Mapping from a symbolic representation of the game state to a list of actions and the resulting states
- Many general game players have been developed
  - Bandit-based Monte Carlo
- Can easily convert game model to GGP
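
Bandit-based Monte Carlo players pick which move to simulate next with a bandit rule. The sketch below shows the standard UCB1 selection step on its own; the surrounding tree search and playouts are omitted, and the function name and argument layout are assumptions.

```python
import math


def ucb1_select(counts, totals, exploration=math.sqrt(2)):
    """Pick the index of the move to try next.

    counts[i] is how often move i was tried, totals[i] its summed reward.
    Untried moves are chosen first; otherwise pick the highest
    mean reward plus exploration bonus.
    """
    for index, count in enumerate(counts):
        if count == 0:
            return index
    total_plays = sum(counts)

    def upper_bound(index):
        mean = totals[index] / counts[index]
        bonus = exploration * math.sqrt(math.log(total_plays) / counts[index])
        return mean + bonus

    return max(range(len(counts)), key=upper_bound)
```

For example, `ucb1_select([10, 10], [9.0, 1.0])` prefers the first move because both have the same bonus and the first has the higher mean, while a move tried only once against one tried a hundred times gets picked for its large exploration bonus.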


Execute actions leading to goal 


legal(Y,mark(M,N)) <=
    true(cell(M,N,b)) &
    true(control(Y))
legal(white,noop) <=
    true(cell(M,N,b)) &
    true(control(black))
legal(black,noop) <=
    true(cell(X,Y,b)) &
    true(control(white))

next(cell(M,N,x)) <=
    does(white,mark(M,N)) &
    true(cell(M,N,b))
next(cell(M,N,o)) <=
    does(black,mark(M,N)) &
    true(cell(M,N,b))
next(cell(M,N,W)) <=
    true(cell(M,N,W)) &
    distinct(W,b)

goal(white,100) <= line(x)
goal(white,50) <= ~line(x) & ~line(o) & ~open
goal(white,0) <= line(o)
goal(black,100) <= line(o)
goal(black,50) <= ~line(x) & ~line(o) & ~open
goal(black,0) <= line(x)

init(cell(1,1,b))
init(cell(1,2,b))
init(cell(1,3,b))
init(cell(2,1,b))
init(cell(2,2,b))
init(cell(2,3,b))
init(cell(3,1,b))
init(cell(3,2,b))
init(cell(3,3,b))
init(control(white))

terminal <= line(x)
terminal <= line(o)
terminal <= ~open


Typical general game player


5. Execute actions leading to goal


General Game Playing is not going to help much for Mario or Zelda

Can use basic search and control algorithms
- Relies on a reasonably good model of the state space transitions for the game

Try and try again
- Millions of deaths are fine as long as it eventually learns; once it does we can look at how to make it learn faster or better


Mario AI Competition 


[Screenshot: Mario AI Competition gameplay with score, coins, and time display]


Conclusions


Results: Deliverables are easy 


Viable game learners answer a lot of questions 
- Can we determine what components are meaningful?
- What games are hard to learn?
- How many games do you need to learn rules?
- Can we construct a useful set of rules that generalize across many games?
- How well can we play even if we only know some of the rules?
- Can we compute a useful reward function from watching human players?


We will use digital games to study... 


- Unsupervised object segmentation and categorization
- Rule learning
- Knowledge transfer between categories
  - Between Mario & Luigi
- Goal inference
- Game state value estimation
- Action planning
- Knowledge transfer between games
  - Between Super Mario 1 and Super Mario 3
  - Between Shogi and Chess


First steps: NES games


Lots and lots of 2D games (>100,000)


[Screenshots: game listing pages from web game portals (e.g. Armor Games)]

2D vs. 3D games


[Screenshot: 3D game interface with the on-screen text "Chrono Boost: Nexus"]